Feature Engineering for Time Series

  • Instead of describing a time series by the step-by-step numeric outputs of a process, we can describe it with a set of features; this gives us access to ML methods designed for cross-sectional data
  • The features can be computed over the entire time series or as rolling or expanding window functions
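
To make the three options concrete, here is a minimal pandas sketch. The toy series, the window size of 3, and the variable names are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy series: 10 evenly spaced observations 0.0 .. 9.0.
series = pd.Series(np.arange(10, dtype=float))

# Whole-series features: a handful of numbers describes the entire series.
global_features = {"mean": series.mean(), "std": series.std()}

# Rolling window: each value summarizes only the 3 most recent observations.
# The first two entries are NaN because the window is not yet full.
rolling_mean = series.rolling(window=3).mean()

# Expanding window: each value summarizes every observation seen so far.
expanding_mean = series.expanding(min_periods=1).mean()
```

Whole-series features give one row per series, which suits classification or clustering of many series. Rolling and expanding features give one row per time step, which suits forecasting, because each row uses only past and current values.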

Considerations for extracting features from time series

  • Stationarity
  • Length of the time series - some features (e.g. maximum and minimum) may become unstable as the time series gets longer
  • Domain knowledge
  • External considerations

Catalog of common features

  • Mean and variance
  • Maximum and minimum
  • Difference between last and first values
  • Number of local maxima and minima
  • Smoothness of the time series
  • Periodicity and autocorrelation of the time series
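
The catalog above can be sketched with plain NumPy. The function name `catalog_features`, the use of mean absolute first difference as a (rough) smoothness measure, and the single-lag autocorrelation are assumptions made for illustration; other definitions are equally reasonable.

```python
import numpy as np

def catalog_features(x, lag=1):
    """Compute a small set of whole-series features (lag must be >= 1)."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x)  # first differences

    # A local maximum is a rise followed by a fall; a local minimum the reverse.
    n_max = int(np.sum((d[:-1] > 0) & (d[1:] < 0)))
    n_min = int(np.sum((d[:-1] < 0) & (d[1:] > 0)))

    # Sample autocorrelation at the given lag.
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    if denom > 0:
        acf = float(np.dot(centered[:-lag], centered[lag:]) / denom)
    else:
        acf = float("nan")  # constant series: autocorrelation is undefined

    return {
        "mean": float(x.mean()),
        "variance": float(x.var()),  # population variance
        "max": float(x.max()),
        "min": float(x.min()),
        "last_minus_first": float(x[-1] - x[0]),
        "n_local_maxima": n_max,
        "n_local_minima": n_min,
        "roughness": float(np.mean(np.abs(d))),  # smaller means smoother
        "autocorr": acf,
    }

# Example: a short zig-zag series.
features = catalog_features([0, 1, 0, 1, 0])
```

For the zig-zag example the lag-1 autocorrelation is strongly negative, which reflects the period-2 oscillation.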

Packages for feature generation

  • tsfresh - computes the following categories of features
    • Descriptive statistics
    • Physics-inspired indicators - nonlinearity (c3), complexity (cid_ce), friedrich_coefficients (returns the coefficients of a model fitted to describe complex nonlinear motion), etc.
    • History-compressing counts

  • Cesium - this library also has a web-based GUI for feature generation
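
To give a feel for what two of the physics-inspired indicators measure, here is a NumPy sketch of the nonlinearity statistic (c3) and the complexity estimate (cid_ce). It is written from their usual definitions and is not tsfresh's own code; the edge-case handling (returning 0.0 for short or constant input) is an assumption, so results may differ from the library in corner cases.

```python
import numpy as np

def c3(x, lag):
    """Nonlinearity: mean of x[t] * x[t + lag] * x[t + 2*lag] over valid t."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if 2 * lag >= n:
        return 0.0  # assumed convention when the series is too short
    return float(np.mean(x[: n - 2 * lag] * x[lag : n - lag] * x[2 * lag :]))

def cid_ce(x, normalize=False):
    """Complexity: square root of the sum of squared first differences."""
    x = np.asarray(x, dtype=float)
    if normalize:
        s = x.std()
        if s == 0:
            return 0.0  # assumed convention for a constant series
        x = (x - x.mean()) / s
    d = np.diff(x)
    return float(np.sqrt(np.dot(d, d)))

c3_value = c3([1, 2, 3, 4], lag=1)     # mean of 1*2*3 and 2*3*4
complexity = cid_ce([0, 3, 3, 7])      # sqrt(3**2 + 0**2 + 4**2)
```

A series with more peaks and valleys has larger squared differences and so a higher cid_ce. In practice one would call the library's own calculators rather than reimplementing them.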

Feature selection

  • Automatic feature generation produces many candidate features, so it should be paired with automatic feature selection
  • FRESH (FeatuRe Extraction based on Scalable Hypothesis tests) - implemented in tsfresh
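  • Recursive Feature Elimination (RFE) can be used - a backward selection method that starts from all features and repeatedly drops the least important one (in contrast to forward selection, which adds features one at a time)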
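
The idea behind RFE can be sketched in a few lines of NumPy. This is a simplified illustration, not scikit-learn's `RFE` class: it assumes a linear least-squares model, ranks features by absolute coefficient on standardized inputs, and assumes no column is constant. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def recursive_feature_elimination(X, y, n_keep):
    """Return the indices of the n_keep features that survive elimination."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    # Standardize so coefficient magnitudes are comparable across features.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()

    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        # Refit on the surviving features, then drop the weakest one.
        coef, *_ = np.linalg.lstsq(Xs[:, remaining], yc, rcond=None)
        weakest = int(np.argmin(np.abs(coef)))
        remaining.pop(weakest)
    return remaining

# Synthetic feature table: only columns 0 and 2 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)

selected = recursive_feature_elimination(X, y, n_keep=2)
```

Refitting after every removal is what makes the procedure recursive: a feature's importance is always judged relative to the features still in the model.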